We show that the noncontextual inequality proposed by Klyachko et al. [Phys. Rev. Lett. 101,
020403 (2008)] belongs to a broader family of inequalities, one associated to each compatibility
structure of a set of events (a graph), and its independence number. These have the surprising property
that the maximum quantum violation is given by the Lovász $\vartheta$-function of the graph, which was
originally proposed as an upper bound on its Shannon capacity. Furthermore, probabilistic theories
beyond quantum mechanics may have an even larger violation, which is given by the so-called
fractional packing number. We discuss in detail, and compare, the sets of probability distributions
attainable by noncontextual, quantum, and generalized models; the latter two are shown to have
semidefinite and linear characterizations, respectively. The implications for Bell inequalities, which
are examples of noncontextual inequalities, are discussed. In particular, we show that every Bell
inequality can be recast as a noncontextual inequality à la Klyachko et al.
Introduction. — Recently, Klyachko, Can, Binicioğlu,
and Shumovsky (KCBS) [1] have introduced a noncontextual
inequality (i.e., one satisfied by any noncontextual
hidden variable theory), which is violated by quantum
mechanics, and therefore can be used to detect quantum
effects. The simplest physical system which exhibits
quantum features in this sense is a three-level quantum
system or qutrit. The KCBS inequality is the
simplest noncontextual inequality violated by a qutrit,
in a similar way that the Clauser-Horne-Shimony-Holt
(CHSH) inequality is the simplest Bell inequality
violated by a two-qubit system.

The KCBS inequality has been recently tested in the
laboratory [6] and has stimulated many recent developments
[7-13]. It can adopt two equivalent forms. Consider
5 yes-no questions $P_i$ ($i = 0, \ldots, 4$) such that $P_i$
and $P_{i+1}$ (with the sum modulo 5) are compatible: both
questions can be jointly asked without mutual disturbance,
so, when the questions are repeated, the same
answers are obtained; and exclusive: not both can be
true. One can represent each of these questions as a vertex
of a pentagon (i.e., a 5-cycle) where the edges denote
compatibility and exclusiveness. What is the maximum
number of yes answers one can get when asking the 5
questions to a physical system? Clearly, two, because of
the exclusiveness condition [1]. If we denote yes and no
by 1 and 0, respectively, then, even if one asks only one
question to each one of an identically prepared collection
of systems, and then counts the average number of yes
answers corresponding to each question, the following
inequality holds:
\[ \beta := \sum_{i=0}^{4} \langle P_i \rangle \le 2, \tag{1} \]



if we assume that these answers are predetermined by
a hidden variable theory. This is the first form of the
KCBS inequality. What has (1) to do with noncontextuality?
Noncontextual hidden variable theories are those
in which the answer of $P_i$ is independent of whether one
asks $P_i$ together with $P_{i-1}$ (which is compatible with $P_i$),
or together with $P_{i+1}$ (which is also compatible with $P_i$).
A set of mutually compatible questions is called a context.
Since $P_{i+1}$ and $P_{i-1}$ are not necessarily compatible,
$\{P_i, P_{i-1}\}$ is one context and $\{P_i, P_{i+1}\}$ is a different
one, and they are not both contained in a joint context.
The assumption is that the answer to $P_i$ will be the same
in both.

Now, let us consider contexts instead of questions, i.e.,
let us ask individual systems not one but two compatible
and exclusive questions. In the pentagon, a context
is represented by an edge connecting two vertices, so we
have 5 different contexts. In order to study the correlations
between the answers to these questions, it is useful
to transform each question into a dichotomic observable
with possible values $-1$ (no) or $+1$ (yes), so when both
questions give the same answer the product of the results
of the observables is $+1$, but when the answers are different
then the product of the results of the observables is
$-1$. For instance, this can be done by defining the observables
$A_i = 2P_i - 1$. Then, inequality (1) is equivalent to
the noncontextual correlation inequality, the second form
of KCBS,
\[ \beta' := \sum_{i=0}^{4} \langle A_i A_{i+1} \rangle \ge -3, \tag{2} \]

which can be derived independently based solely on the
assumption that the observables $A_i$ have noncontextual
results $-1$ or $+1$. That is, we do not need to assume
exclusiveness to derive it, effectively because the occurrence of
the correlation functions $\langle A_i A_{i+1} \rangle$ implements a penalty for
violating exclusiveness.
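
As a quick sanity check on the two forms of the inequality, the following Python sketch enumerates all deterministic noncontextual assignments on the pentagon. It is only an illustration: the names `exclusive`, `beta_max` and `beta_prime_min` are introduced here and do not appear in the original text.

```python
from itertools import product

N = 5  # number of questions on the pentagon (5-cycle)

def exclusive(p):
    """True if no two cyclically adjacent questions are both answered 'yes'."""
    return all(not (p[i] and p[(i + 1) % N]) for i in range(N))

# First form: largest number of 'yes' answers compatible with exclusiveness.
beta_max = max(sum(p) for p in product((0, 1), repeat=N) if exclusive(p))

# Second form: smallest value of sum_i A_i A_{i+1} over all +/-1 assignments,
# with no exclusiveness condition imposed.
beta_prime_min = min(
    sum(a[i] * a[(i + 1) % N] for i in range(N))
    for a in product((-1, 1), repeat=N)
)

print(beta_max, beta_prime_min)
```

The enumeration reproduces the bounds 2 and -3 of inequalities (1) and (2), and shows that the second bound holds even when exclusiveness is not assumed.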

For a qutrit, the maximum quantum violation of inequality
(1) was shown to be $\beta_{QM}(5) = \sqrt{5} \approx 2.236$,
which is equivalent to a violation of inequality (2) of
$\beta'_{QM}(5) = 5 - 4\sqrt{5} \approx -3.94$. The maximum violation of
the KCBS inequality occurs for the state $\langle\psi| = (0, 0, 1)$
and the questions $P_i = |v_i\rangle\langle v_i|$ or the observables
$A_i = 2|v_i\rangle\langle v_i| - 1$, where

\[
\begin{aligned}
\langle v_0| &= N_0 \left(1,\, 0,\, \sqrt{\cos(\pi/5)}\right), \\
\langle v_{1,4}| &= N_1 \left(\cos(4\pi/5),\, \pm\sin(4\pi/5),\, \sqrt{\cos(\pi/5)}\right), \\
\langle v_{2,3}| &= N_2 \left(\cos(2\pi/5),\, \mp\sin(2\pi/5),\, \sqrt{\cos(\pi/5)}\right),
\end{aligned}
\tag{3}
\]

the $N_i$ being suitable normalization factors. These vectors
connect the origin with the vertices of a regular
pentagon. Interestingly, with this choice,
$\langle A_i A_{i+1} \rangle = -[-1 + 3\cos(\pi/5)] \sec^2(\pi/10)/2 = 1 - 4/\sqrt{5} \approx -0.789$
for $i = 0, \ldots, 4$. Observe that $\langle v_i | v_{i+1} \rangle = 0$ and
$\beta_{QM}(5) = \sum_{i=0}^{4} |\langle\psi|v_i\rangle|^2$.
The vectors that give $\beta_{QM}(5)$ form an orthonormal
representation of the 5-cycle.
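
The qutrit construction can be checked numerically. The sketch below assumes the five vectors of Eq. (3) are written uniformly as $|v_i\rangle \propto (\cos(4\pi i/5), \sin(4\pi i/5), \sqrt{\cos(\pi/5)})$, which reproduces the sign pattern of Eq. (3); the helper `dot` and all variable names are illustrative.

```python
import math

h = math.sqrt(math.cos(math.pi / 5))   # common third component before normalisation
norm = math.sqrt(1 + h * h)            # the same normalisation works for all five vectors

# v_i at azimuthal angle 4*pi*i/5, i = 0..4 (assumed uniform form of Eq. (3))
v = [
    (math.cos(4 * math.pi * i / 5) / norm,
     math.sin(4 * math.pi * i / 5) / norm,
     h / norm)
    for i in range(5)
]
psi = (0.0, 0.0, 1.0)

def dot(a, b):
    """Euclidean inner product of two real 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

# Orthogonality of cyclically adjacent vectors (exclusiveness of P_i and P_{i+1}).
overlaps = [dot(v[i], v[(i + 1) % 5]) for i in range(5)]

# Probabilities <P_i> = |<psi|v_i>|^2 and the two KCBS expressions.
p = [dot(psi, vi) ** 2 for vi in v]
beta = sum(p)
# <A_i A_{i+1}> = 1 - 2 p_i - 2 p_{i+1}, using P_i P_{i+1} = 0.
beta_prime = sum(1 - 2 * p[i] - 2 * p[(i + 1) % 5] for i in range(5))

print(beta, beta_prime)
```

Up to floating-point error, `beta` equals $\sqrt{5}$ and `beta_prime` equals $5 - 4\sqrt{5}$.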

General compatibility structures. — The KCBS inequal- 
ity suggests itself a generalization to arbitrary graphs in- 
stead of the pentagon. Most generally and abstractly, 
Kochen-Specker (KS) theorems [1] are about the possibility
of interpreting a given structure of compatibility
of "events," and additional constraints such as exclusiveness,
in a classical or nonclassical probabilistic theory. In
this paper, these events are interpreted as atomic events,
each of which can occur in different contexts. Formally,
the events are labelled by a set $V$ (in practice finite, and
often just integer indices, $V = \{1, 2, \ldots, n\}$). The set of
all valid contexts is a hypergraph $\Gamma$, which is simply a
collection of subsets $C \subset V$; note that for hypergraphs
of contexts, with each $C \in \Gamma$, all of the subsets of $C$ are
also valid contexts, and hence part of $\Gamma$. The interpretation
is that there should exist (deterministic) events in a
probabilistic model, one $P_i$ for each $i \in V$, and for each
context $C$ a measurement among whose outcomes are the
$P_i$ ($i \in C$). The events are hence mutually exclusive, as
in the measurement postulated to exist for some $C \in \Gamma$,
at most one outcome $i \in C$ can occur. For instance, a
classical (noncontextual) model would be a measurable
space $\Omega$, with each $P_i$ being the indicator function of a
measurable set (an event, in fact) such that for all $C \in \Gamma$,
$\sum_{i \in C} P_i \le 1$ (i.e., the supporting sets of the $P_i$ should
be pairwise disjoint).

In contrast, a quantum model requires a Hilbert space
$\mathcal{H}$ and associates projection operators $P_i$ to all $i \in V$,
such that for all $C \in \Gamma$, $\sum_{i \in C} P_i \le \mathbb{1}$ (i.e., the $P_i$ can be
thought of as outcomes in a von Neumann measurement).

Thanks to KS we know that quantum models are
strictly more powerful than classical ones; but they are
still not the most general ones. A generalized model
requires choosing a generalized probabilistic theory in
which the $P_i$ can be interpreted as measurement outcomes:
following [15-19], formally it consists of a real
vector space $\mathcal{A}$ of observables, with a distinguished unit
element $u \in \mathcal{A}$ and a vector space order: the latter is
given by the closed convex cone $\mathcal{P} \subset \mathcal{A}$ of positive elements
containing $u$ in its interior, such that $\mathcal{P}$ spans $\mathcal{A}$
and is pointed, meaning that, with the exception of 0, $\mathcal{P}$
is entirely on one side of a hyperplane. For two elements
$X, Y \in \mathcal{A}$ we then say $X \le Y$ if and only if $Y - X \in \mathcal{P}$.
(We shall only discuss finite dimensional $\mathcal{A}$, otherwise
there will be additional topological requirements.) The
elements $E$ with $0 \le E \le u$ are called effects. This structure
is enough to talk about measurements: they are collections
of effects $(E_1, \ldots, E_k)$ such that $\sum_{j=1}^{k} E_j = u$.

[Observe how we recover quantum mechanics when $\mathcal{P}$
consists of the semidefinite matrices within the Hermitian
ones over a Hilbert space, and $u = \mathbb{1}$. Classical
probability instead, when $\mathcal{P}$ are the non-negative functions
within the measurable ones over a measure space,
$u$ being the constant 1 function.]

Now, a generalized model for the hypergraph $\Gamma$ is the
association of an effect $P_i \in \mathcal{A}$ to each $i \in V$, such that
each $P_i$ is a sum of normalized extremal effects, and for
all $C \in \Gamma$, $\sum_{i \in C} P_i \le u$. The latter condition ensures
that the family $(P_i : i \in C)$ can be completed to a measurement,
possibly in a larger space $\mathcal{A}' \supset \mathcal{A}$. We finally
demand that this can be done such that also $u - \sum_{i \in C} P_i$
is a sum of normalized extremal effects.

Notice that in all of the above we never require that 
any particular context should be associated to a com- 
plete measurement: the conditions only make sure that 
each context is a subset of outcomes of a measurement 
and that they are mutually exclusive. Thus, unlike the 
original KS theorem, it is clear that every context hypergraph
$\Gamma$ always has a classical noncontextual model,
besides possibly quantum and generalized models. This
is where noncontextual inequalities come in: note that
all of the above types of models allow for the choice of a
state (be it a probability density, a quantum density operator,
or generalized state), under which all expectation
values $\langle P_i \rangle$ make sense, and hence also the expression

\[ \beta = \sum_{i \in V} \langle P_i \rangle. \tag{4} \]



Moreover, all probabilities $\langle P_i \rangle$ are independent of the
context in which $P_i$ occurs, as they depend only on the
effect $P_i$ and the underlying state. Since this is the condition
underlying Gleason's theorem, we call it the Gleason
property.

We can then ask for the set of all attainable vectors
$(\langle P_i \rangle)_{i \in V}$ for a given hypergraph $\Gamma$, over all models of a
given sort (classical noncontextual, quantum mechanical,
or generalized probabilistic theory) and states within it.
These are evidently convex subsets of $[0,1]^V \subset \mathbb{R}^V$; we
denote the sets of noncontextual, quantum and generalized
expectations by $\mathcal{E}_C(\Gamma)$, $\mathcal{E}_{QM}(\Gamma)$ and $\mathcal{E}_{GPT}(\Gamma)$,
respectively. The central task of the present theory is to
characterize these convex sets and to compare them for
various $\Gamma$. This is because a point $p \in \mathcal{E}_X(\Gamma)$ in any
of these sets describes the outcome probabilities of any
compatible set of events (i.e., any context). Note that all
of them are corners in the language of [20]: if $0 \le q_i \le p_i$
for all $i \in V$, then $p \in \mathcal{E}_X(\Gamma)$ implies also $q \in \mathcal{E}_X(\Gamma)$.

In particular, the extreme values of $\beta$ over these sets
are denoted $\beta_C(\Gamma)$, $\beta_{QM}(\Gamma)$, and $\beta_{GPT}(\Gamma)$, respectively.
It is clear that

\[ \beta_C(\Gamma) \le \beta_{QM}(\Gamma) \le \beta_{GPT}(\Gamma) \tag{5} \]

by definition.



Maximum values. — Prepared by the above discussion, 
for a given hypergraph $\Gamma$, we can define the adjacency
graph $G$ on the vertex set $V$: two $i, j \in V$ are joined
by an edge if and only if there exists a $C \in \Gamma$ such that
both $i, j \in C$. Then,

\[ \beta_C(\Gamma) = \alpha(G), \qquad \beta_{QM}(\Gamma) = \vartheta(G), \tag{6} \]

where $\alpha(G)$ is the independence number of the graph, i.e.
the maximum number of pairwise disconnected vertices,
and $\vartheta(G)$ is the Lovász $\vartheta$-function of $G$ [20-23], defined
as follows: First, an orthonormal representation (OR) of
a graph is a set of unit vectors associated to the vertices
such that two vectors are orthogonal if the corresponding
vertices are adjacent. Then,

\[ \vartheta(G) := \max \sum_{i=1}^{n} |\langle \psi | v_i \rangle|^2, \tag{7} \]

where the maximum is taken over all unit vectors $|\psi\rangle$ (in
Euclidean space) and ORs $\{|v_i\rangle : i = 1, \ldots, n\}$ of $G$ [21].
Note that on the right hand side, we can get rid of $|\psi\rangle$
by observing

\[ \max_{\psi} \sum_{i=1}^{n} |\langle \psi | v_i \rangle|^2 = \left\| \sum_{i=1}^{n} |v_i\rangle\langle v_i| \right\|_{\infty}. \tag{8} \]



Furthermore, $\vartheta(G)$ is given by a semidefinite program
[21], which explains the key importance of this number
for combinatorial optimization and zero-error information
theory - indeed $\vartheta(G)$ is an upper bound to the Shannon
capacity of a graph [21].



Observe that this says in particular that when dis- 
cussing classical and quantum models, we never need to 
consider contexts of more than two events. Indeed, it is 
a (nontrivial) property of these models that if in a set 
of events any pair is compatible and exclusive, then so is 
the whole set; more generalized probabilistic theories do 
not have this property, cf. . 



To prove Eq. (6), we notice that for a given probabilistic
model, the expectation is always maximized on
an extremal, i.e. pure, state. In the classical case, this
amounts to choosing a point $\omega \in \Omega$, so that $w_i := P_i(\omega)$
is a 0-1-valuation of the set $V$. By definition, it has the
property that, in each hyperedge $C \in \Gamma$, at most one element
is marked 1, and $\beta$ is simply the number of marked
elements. It is clear that the marked elements form an
independent set in $\Gamma$ (and equivalently in the graph $G$).
In the quantum case, let the maximizing state be given
by a unit vector $|\psi\rangle$, and for each $i$,
$\langle \psi | P_i | \psi \rangle = |\langle \psi | v_i \rangle|^2$
for $|v_i\rangle := P_i |\psi\rangle / \sqrt{\langle \psi | P_i | \psi \rangle}$. This clearly is an orthogonal
representation of $G$, in fact the projectors $|v_i\rangle\langle v_i|$
form another quantum model of $\Gamma$, with the same maximum
value of $\beta$, which by the definition we gave earlier
is just Lovász' $\vartheta(G)$.

Each graph $G$ where $\alpha(G) < \vartheta(G)$ thus exhibits a limitation
of classical noncontextuality, which can be witnessed
in experiments with an appropriate set of projectors,
and on an appropriate state. In this sense, each
such graph provides a proof of the KS theorem.
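
To make the classical side of Eq. (6) concrete, here is a minimal brute-force sketch that builds the adjacency graph from a list of contexts and computes its independence number. The function names `adjacency_edges` and `independence_number` are hypothetical, and the exhaustive search is only practical for small vertex sets.

```python
import math
from itertools import combinations

def adjacency_edges(contexts):
    """Edges of the adjacency graph: i ~ j iff some context contains both."""
    edges = set()
    for context in contexts:
        for i, j in combinations(sorted(context), 2):
            edges.add((i, j))  # stored with i < j
    return edges

def independence_number(n, edges):
    """Largest set of pairwise non-adjacent vertices, by exhaustive search."""
    best = 0
    for mask in range(1 << n):
        chosen = [i for i in range(n) if (mask >> i) & 1]
        if all((i, j) not in edges for i, j in combinations(chosen, 2)):
            best = max(best, len(chosen))
    return best

# The KCBS pentagon: five two-element contexts {i, i+1 mod 5}.
pentagon = [{i, (i + 1) % 5} for i in range(5)]
edges = adjacency_edges(pentagon)
alpha = independence_number(5, edges)

print(alpha, math.sqrt(5))  # classical value versus the quantum value quoted for the pentagon
```

For the pentagon this gives $\alpha = 2 < \sqrt{5}$, the gap witnessed by the KCBS inequality.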



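Taking $n \ge 5$ odd and applying a result from [21]
to $G = C_n$, the $n$-cycle, one obtains the noncontextual
and quantum bounds

\[ \alpha(C_n) = \frac{n-1}{2}, \qquad \vartheta(C_n) = \frac{n \cos(\pi/n)}{1 + \cos(\pi/n)}, \tag{9} \]

where $C_n$ denotes the $n$-cycle. After some algebra, the
quantum bound for the analogue of $\beta'$ can be written as

\[ \beta'_{QM}(n) = n - 4\vartheta(C_n) = -\frac{n}{2} \left[ -1 + 3\cos\left(\frac{\pi}{n}\right) \right] \sec^2\left(\frac{\pi}{2n}\right) \tag{10} \]

for all state space dimensions larger than or equal to 3; the
same result was obtained recently by Liang, Spekkens,
and Wiseman [24].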
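
The closed forms for odd cycles are easy to tabulate. The sketch below assumes, as for the pentagon, that the correlation form is related to $\beta$ by $\beta' = n - 4\beta$ (from $A_i = 2P_i - 1$ and exclusiveness of neighbours); the function name `cycle_bounds` is illustrative.

```python
import math

def cycle_bounds(n):
    """Return (alpha, theta, beta_prime_qm) for the n-cycle, n odd and >= 5."""
    c = math.cos(math.pi / n)
    alpha = (n - 1) // 2              # independence number of an odd cycle
    theta = n * c / (1 + c)           # Lovasz theta of an odd cycle
    beta_prime_qm = n - 4 * theta     # assumed relation beta' = n - 4 * beta
    return alpha, theta, beta_prime_qm

def beta_prime_sec_form(n):
    """Equivalent closed form using sec^2(pi / 2n) = 2 / (1 + cos(pi / n))."""
    sec2 = 1 / math.cos(math.pi / (2 * n)) ** 2
    return -(n / 2) * (-1 + 3 * math.cos(math.pi / n)) * sec2

for n in (5, 7, 9):
    print(n, cycle_bounds(n), beta_prime_sec_form(n))
```

For $n = 5$ this returns $\alpha = 2$, $\vartheta = \sqrt{5}$ and $\beta' = 5 - 4\sqrt{5}$, matching the values quoted for the pentagon.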

We remark here that there are also "state-independent"
KS proofs [4, 25, 26]: these are given by
quantum noncontextual models of a graph $G$ such that
$\sum_i \langle P_i \rangle > \alpha(G)$ for every state. The proofs in the literature
typically have this property, as they are based on
rank-one $P_i = |v_i\rangle\langle v_i|$, and for each $j \in V$ there exists
$C \in \Gamma$ such that $j \in C$ and $\sum_{i \in C} P_i = \mathbb{1}$ (i.e., each $P_j$ is
part of a context that is already a complete measurement;
the $|v_i\rangle$ forming a complete orthonormal basis). Due to
the symmetric structure of most KS proofs, $\sum_i P_i$ turns
out to be proportional to the identity, so $\beta$ is independent
of the state.

It is known that $\vartheta(G)$ can be much larger than $\alpha(G)$;
in particular, it is known that (for appropriate, arbitrarily
large $n$) there are graphs $G$ with $\vartheta(G) \sim \sqrt{n}$
but $\alpha(G) \approx 2\log n$, and others with $\vartheta(G)$ growing as a root of $n$ but
$\alpha(G) = 3$ [13]. Hence, the quantum violation of noncontextual
inequalities can be arbitrarily large.

Description of the probability sets. — We now show that 
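arbitrary linear functions can be optimized over $\mathcal{E}_{QM}(\Gamma)$
as semidefinite programs: for an arbitrary vector $\lambda \in \mathbb{R}^V$,
let

\[ \lambda(\mathcal{E}_{QM}(\Gamma)) = \max \sum_{i} \lambda_i p_i \quad \text{s.t.} \quad p \in \mathcal{E}_{QM}(\Gamma). \tag{11} \]

First of all, without loss of generality, all $\lambda_i$ are non-negative;
this follows because $\mathcal{E}_{QM}(\Gamma)$ is a corner and
hence $\lambda(\mathcal{E}_{QM}(\Gamma))$ is unchanged when we replace all negative
$\lambda_i$ by 0. Now recall that $p_i = |\langle \psi | v_i \rangle|^2$ for some unit
vector $|\psi\rangle$ and an orthonormal representation
$\{|v_i\rangle \propto P_i |\psi\rangle\}$ of $G$. Hence,

\[
\begin{aligned}
\sum_{i} \lambda_i p_i &= \sum_{i} \lambda_i |\langle \psi | v_i \rangle|^2 \\
&= \langle \psi | \Big( \sum_{i \in V} \lambda_i |v_i\rangle\langle v_i| \Big) | \psi \rangle \\
&= \langle t | \Big( \sum_{ij \in V} \sqrt{\lambda_i \lambda_j}\, \langle v_i | v_j \rangle\, |i\rangle\langle j| \Big) | t \rangle \\
&= \sum_{ij \in V} \sqrt{\lambda_i \lambda_j}\; \bar{t}_i t_j \langle v_i | v_j \rangle \\
&= \operatorname{tr} T\Lambda,
\end{aligned}
\tag{12}
\]

for an appropriate vector $|t\rangle \in \mathbb{C}^V$, because the Hermitian
matrices in the second and third line (the latter a
Gram matrix) have the same spectrum. The matrices $T$
and $\Lambda$ in the last line are defined as follows:

\[ \Lambda_{ij} = \sqrt{\lambda_i \lambda_j}, \qquad T_{ij} = \bar{t}_i t_j \langle v_i | v_j \rangle. \]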



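When varying over quantum models of $G$ and states $\psi$,
the matrix $T$ varies over all semidefinite $T \ge 0$ such that
$\operatorname{tr} T = 1$ and $T_{ij} = 0$ whenever $i \ne j$ are connected by
an edge in $G$. I.e.,

\[
\lambda(\mathcal{E}_{QM}(\Gamma)) = \max \operatorname{tr} \Lambda T
\quad \text{s.t.} \quad T \ge 0,\ \operatorname{tr} T = 1,\ \forall i \sim j:\ T_{ij} = 0,
\tag{13}
\]

which is indeed a semidefinite program. □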
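
As an illustration of this semidefinite formulation, the sketch below exhibits one feasible $T$ for the pentagon with all weights $\lambda_i = 1$ and checks that it attains $\sqrt{5}$. It assumes the uniform form of the KCBS vectors, $|v_i\rangle \propto (\cos(4\pi i/5), \sin(4\pi i/5), \sqrt{\cos(\pi/5)})$, and the symmetric choice $t_i = 1/\sqrt{5}$; both are assumptions made for this example, and no SDP solver is involved.

```python
import math

n = 5
h = math.sqrt(math.cos(math.pi / 5))
norm = math.sqrt(1 + h * h)
v = [
    (math.cos(4 * math.pi * i / n) / norm,
     math.sin(4 * math.pi * i / n) / norm,
     h / norm)
    for i in range(n)
]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Gram matrix of the orthonormal representation.
gram = [[dot(v[i], v[j]) for j in range(n)] for i in range(n)]

# Assumed symmetric choice of the unit vector t.
t = [1 / math.sqrt(n)] * n

# T_ij = conj(t_i) t_j <v_i|v_j>; all quantities are real here.
# T is a positive multiple of a Gram matrix, hence positive semidefinite.
T = [[t[i] * t[j] * gram[i][j] for j in range(n)] for i in range(n)]

trace_T = sum(T[i][i] for i in range(n))
edge_entries = [T[i][(i + 1) % n] for i in range(n)]    # must vanish on edges
value = sum(T[i][j] for i in range(n) for j in range(n))  # tr(Lambda T), Lambda all ones

print(trace_T, value)
```

The matrix satisfies the constraints of the program, with unit trace and zeros on the edges of $C_5$, and its objective value is $\sqrt{5}$.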



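Closing this semidefinite discussion, the primal
SDP above has a dual, as follows:

\[
\lambda(\mathcal{E}_{QM}(\Gamma)) = \min s
\quad \text{s.t.} \quad s\mathbb{1} \ge S,\ S = S^\dagger,\ \forall (i \not\sim j \text{ or } i = j):\ S_{ij} = \Lambda_{ij}.
\tag{14}
\]

The value $\lambda(\mathcal{E}_{QM}(\Gamma))$ is known as a weighted Lovász number
(or $\vartheta$-function) [20].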

The previous discussion implies that not only function
optimization, but also membership in $\mathcal{E}_{QM}(\Gamma)$ is an efficient
convex problem: there is a polynomial-time algorithm
that, given a vector $p$, tests whether it is in $\mathcal{E}_{QM}(\Gamma)$



or not. This follows from general considerations of con- 
vex optimisation [IT 
Does there exist such a nice and efficient description
also for the classical set $\mathcal{E}_C(\Gamma)$? The fact that the maximum
of $\beta$ over it is the independence number $\alpha(G)$, which
is well known to be NP-hard to compute, means that the answer
is "no." In fact, $\mathcal{E}_C(\Gamma)$ encodes the independence numbers
$\alpha(G|_S)$ of all induced subgraphs of $G$ on subsets
$S \subset V$, and the best description that we have is as the
following 0-1-polytope:

\[ \mathcal{E}_C(\Gamma) = \operatorname{conv}\{ \sigma : \sigma_i \in \{0,1\},\ \forall i \sim j:\ \sigma_i \sigma_j = 0 \}. \tag{15} \]



Turning to generalized probabilistic models, $\beta_{GPT}(\Gamma)$
seems at first much harder to characterize, and we need to
look at the full hypergraph structure. Indeed, it is this
value that we should with good reason consider as the
"algebraic bound" for $\beta$. After all, it is the largest value
we can assign to it under the most general interpretation
of the events $i \in V$ in a probabilistic model that obeys
the Gleason property.

The difficulty in evaluating $\beta_{GPT}(\Gamma)$ lies in capturing
the constraint that the $P_i$ have to be sums of extremal,
normalized effects in the generalized probabilistic theory.
If we relax this condition simply to $P_i$ having to be an
effect, we arrive at what we would like to call a fuzzy
model, which formalizes the notion that all $\{P_i : i \in C\}$
are compatible, but not necessarily exclusive events: so
we are left with Gleason's constraints $0 \le \langle P_i \rangle \le 1$ and
for all $C \in \Gamma$, $\sum_{i \in C} \langle P_i \rangle \le 1$. Denote the (convex) set
of all expectations $(\langle P_i \rangle)_{i \in V}$ when varying over models
and their states by $\mathcal{E}_F(\Gamma)$.



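\[ \beta_{GPT}(\Gamma) = \beta_F(\Gamma) = \alpha^*(\Gamma), \tag{16} \]

where $\alpha^*(\Gamma)$ is the so-called fractional packing number
of the hypergraph $\Gamma$, defined by the following intuitive
linear program:

\[
\alpha^*(\Gamma) = \max \sum_{i \in V} w_i
\quad \text{s.t.} \quad \forall i:\ 0 \le w_i \le 1 \ \text{ and } \ \forall C \in \Gamma:\ \sum_{i \in C} w_i \le 1.
\tag{17}
\]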



The vectors $w$ are known as fractional packings of $\Gamma$. To
prove Eq. (16), observe on the one hand that, for a given
fuzzy model $\{P_i\}$ and a state $\rho$, the weights $w_i = \langle P_i \rangle$
form a fractional packing. Furthermore, a fractional
packing $\{w_i\}$ is a fuzzy noncontextual model for the
unique generalized probabilistic theory in $\mathbb{R}$, with the
usual ordering and unit 1; the state is the identity. (In
other words, $\mathcal{E}_F(\Gamma)$ is precisely the polytope of fractional
packings of $\Gamma$.)

Conversely, given a fractional packing $w$, we now show
that there is an appropriate generalized probabilistic
model with effects $P_i$ and a state, such that $w_i = \langle P_i \rangle$.
Indeed, as the set of normalized states we choose
$S = 1 \oplus \mathcal{E}_F(\Gamma)$, spanning a cone $\mathbb{R}_{\ge 0} S \subset \mathbb{R} \oplus \mathbb{R}^V$. The dual
cone (with respect to the usual Euclidean inner product)
is the set of positive observables: $S' =: \mathcal{P} \subset \mathbb{R} \oplus \mathbb{R}^V$ with
unit element $u = 1 \oplus 0^V \in \mathcal{P}$, which is 1 precisely on the
affine hyperplane spanned by $S$. Now, for each $i \in V$, let
$P_i = 0 \oplus \delta_i \in \mathcal{P}$ be the $i$-th standard basis vector. Clearly,
for a given fractional packing (i.e., state) $w$ and all $i \in V$,
$\langle P_i \rangle = w_i$. Hence, all that remains to show is that these
$P_i$ and all $Q_C = u - \sum_{i \in C} P_i$ are extremal and normalized
(assuming that $C \in \Gamma$ is a maximal element). Concerning
normalization, observe that the fractional packings $\delta_i$
and 0 (the all-zero assignment) yield proper states. Regarding
extremality, observe that on $S$, $P_i$ and all $Q_C$
are non-negative; furthermore, the equations $\langle P_i \rangle = 0$
and $\langle Q_C \rangle = 0$ each define hyperplanes intersecting $\mathbb{R}_{\ge 0} S$
in a convex set of dimension $|V|$, i.e. these equations define
facets of the cone $\mathbb{R}_{\ge 0} S$, meaning that all $\mathbb{R}_{\ge 0} P_i$ and
$\mathbb{R}_{\ge 0} Q_C$ are indeed extremal rays.

Note that by the above argument we proved in fact that
$\mathcal{E}_{GPT}(\Gamma) = \mathcal{E}_F(\Gamma)$, the set of fractional packings. This
means that any linear function of expectation values can
be optimized over $\mathcal{E}_{GPT}(\Gamma)$ as a linear program; likewise,
checking whether $p$ is in $\mathcal{E}_{GPT}(\Gamma)$ is a linear programming
feasibility problem. □

As an example, for the $n$-cycles above, $\alpha^*(C_n) = n/2$,
regardless of the parity of $n$, which is strictly larger than
$\vartheta(C_n)$ for all odd $n \ge 5$. Again, we know of arbitrarily
large separations: there are hypergraphs $\Gamma$ such that
the adjacency graph $G$ is the complete graph $K_n$, hence
$\alpha(G) = \vartheta(G) = 1$, yet $\alpha^*(\Gamma) > 1$.
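
The value $n/2$ for the cycle can be certified by linear programming duality without running a solver. The sketch below, for $n = 5$ with exact rational arithmetic, checks that the uniform weights $w_i = 1/2$ form a fractional packing and that putting weight $1/2$ on each edge context gives a matching dual certificate. The names `is_fractional_packing` and `covers_every_vertex` are illustrative, and only the maximal contexts (edges) are listed because constraints for their subsets are implied.

```python
from fractions import Fraction

n = 5
contexts = [(i, (i + 1) % n) for i in range(n)]  # maximal contexts of the n-cycle

def is_fractional_packing(w, contexts):
    """Check the constraints of the linear program (17)."""
    in_range = all(0 <= x <= 1 for x in w)
    packed = all(sum(w[i] for i in C) <= 1 for C in contexts)
    return in_range and packed

def covers_every_vertex(y, contexts, n):
    """Dual feasibility: the context weights meeting each vertex sum to >= 1."""
    return all(
        sum(y[k] for k, C in enumerate(contexts) if i in C) >= 1
        for i in range(n)
    )

w = [Fraction(1, 2)] * n   # primal candidate: weight 1/2 on every vertex
y = [Fraction(1, 2)] * n   # dual candidate: weight 1/2 on every edge context

primal_value = sum(w)
dual_value = sum(y)

print(is_fractional_packing(w, contexts), covers_every_vertex(y, contexts, n))
print(primal_value, dual_value)
```

Since any feasible dual value upper-bounds any fractional packing, equality of the two values shows that $5/2$ is optimal for the pentagon.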

Remark: Our $\mathcal{E}_{QM}(\Gamma)$ equals Knuth's set TH($G$) [20] for
the adjacency graph $G$ of $\Gamma$; likewise our $\mathcal{E}_C(\Gamma)$ equals his
STAB($G$) and if $\Gamma$ is the hypergraph of all cliques in $G$,
also $\mathcal{E}_{GPT}(\Gamma)$ = QSTAB($G$). Knuth introduced these sets
in his treatment of the (weighted) Lovász $\vartheta$-function,
independence numbers and fractional packing numbers, in
an attempt to explain the so-called "sandwich theorem"
structurally.

Bell inequalities. — Where does nonlocality come into
this? After all, Bell inequalities exploit locality in the
form that one party's measurement is compatible with
another party's, and that the former's outcomes are independent
of the latter's choices (i.e., insensitive to different
contexts). We can model this also in our setting,
by going to the atomic events, which are labelled by a list
of settings and outcomes for each party. For instance, for
bipartite scenarios, let Alice and Bob's settings be $x \in \mathcal{X}$
and $y \in \mathcal{Y}$, respectively, and their respective outcomes
be $a \in \mathcal{A}$ and $b \in \mathcal{B}$. Then, we construct a graph with
vertex set $V = \mathcal{A} \times \mathcal{B} \times \mathcal{X} \times \mathcal{Y}$ and edges $abxy \sim a'b'x'y'$
if and only if ($x = x'$ and $a \ne a'$) or ($y = y'$ and $b \ne b'$),
encoding precisely that two events in $V$ are connected in
the graph if and only if they are compatible and mutually
exclusive (as events in the Bell experiment as a whole).
Let $\Gamma$ be the hypergraph of all cliques in $G$.
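
For a concrete instance of this construction, the sketch below builds the graph for two parties with binary settings and outcomes, restricts attention to the eight events with $a \oplus b = xy$ (the events that carry non-negative weight when the CHSH inequality is written as a sum of probabilities), and computes the independence number of that subgraph by brute force. The restriction to these events and the names `adjacent`, `winning` and `max_independent` are choices made for this illustration.

```python
from itertools import combinations, product

def adjacent(e, f):
    """Edge rule: same setting with different outcome, for Alice or for Bob."""
    a, b, x, y = e
    a2, b2, x2, y2 = f
    return (x == x2 and a != a2) or (y == y2 and b != b2)

events = list(product((0, 1), repeat=4))  # all events (a, b, x, y)
winning = [e for e in events if (e[0] ^ e[1]) == (e[2] & e[3])]

def max_independent(vertices):
    """Size of the largest pairwise non-adjacent subset (exhaustive search)."""
    best = 0
    for r in range(len(vertices) + 1):
        for subset in combinations(vertices, r):
            if all(not adjacent(e, f) for e, f in combinations(subset, 2)):
                best = r
    return best

n_edges = sum(1 for e, f in combinations(winning, 2) if adjacent(e, f))
classical_value = max_independent(winning)

print(len(winning), n_edges, classical_value)
```

The result 3 is the classical bound on the sum of the four probabilities $p(a \oplus b = xy \mid xy)$, which corresponds to the familiar CHSH bound 2 for the correlator form $S = 2\sum_{xy} p(a \oplus b = xy \mid xy) - 4$.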

We can now discuss classical noncontextual, quantum
and generalized models for this graph, and hence also
noncontextual inequalities, restricting as above to linear
functions $\lambda \cdot p$ of the vector of the probabilities
$p_{ab|xy} = \langle P_{abxy} \rangle$, with non-negative coefficient vector $\lambda$. Note
that any Bell inequality can always be rewritten in such a
form, by removing negative coefficients using the identity
$-p_{ab|xy} = -1 + \sum_{a'b' \ne ab} p_{a'b'|xy}$ for all $x$, $y$, $a$, and $b$.
These equations are not automatically realized in the sets
$\mathcal{E}_X(\Gamma)$, $X$ = C, QM, GPT - as indeed in the underlying
(classical, quantum or generalized) model it need not
hold that $\sum_{ab} P_{abxy}$ is the unit element, for any $x$, $y$.
Hence, define for any class of models $X$ = C, QM, GPT,

\[ \mathcal{E}^{1}_X(\Gamma) := \mathcal{E}_X(\Gamma) \cap \Big\{ p : \forall xy \ \sum_{ab} p_{ab|xy} = 1 \Big\}, \tag{18} \]

the set of probability assignments consistent with the
contextuality structure $\Gamma$, and in addition satisfying
normalization.

In the appendix we prove (which is not too difficult)
that $\mathcal{E}^{1}_C(\Gamma)$ is precisely the set of correlations explained
by local hidden variable theories, and that $\mathcal{E}^{1}_{GPT}(\Gamma)$ are
exactly the no-signalling correlations. Furthermore, to
calculate the local hidden variable value $\Omega_c$ of a given
Bell inequality with non-negative coefficient vector $\lambda$, it
holds that

\[ \Omega_c = \lambda(\mathcal{E}^{1}_C(\Gamma)) = \lambda(\mathcal{E}_C(\Gamma)). \tag{19} \]

In this sense, any Bell inequality is at the same time a
noncontextual inequality for the underlying graph $G$.

With classical and no-signalling correlations taken care
of, we turn our attention to the quantum case. Once
again, we refer the reader to the appendix for a proof
that the following subset of $\mathcal{E}^{1}_{QM}(\Gamma)$ is precisely the set of
correlations obtainable by local quantum measurements
on a bipartite state (where "local" means that all operators
of one party commute with all operators of another
party):

\[ \mathcal{E}^{\mathbb{1}}_{QM}(\Gamma) = \Big\{ (\langle P_{abxy} \rangle)_{abxy \in V} : \forall xy \ \sum_{ab} P_{abxy} = \mathbb{1} \Big\}. \tag{20} \]

I.e., we add the completeness relation for the measurements
in the model. This of course also means that for
a given Bell inequality with coefficients $\lambda$, the maximum
quantum value is

\[ \Omega_q = \lambda(\mathcal{E}^{\mathbb{1}}_{QM}(\Gamma)). \tag{21} \]

For the time being we do not know whether the set of
quantum correlations, i.e. $\mathcal{E}^{\mathbb{1}}_{QM}(\Gamma)$, is efficient to characterize.
It follows, however, from the above considerations
and the general theory of convex optimization [29-31]
that the - potentially larger - set $\mathcal{E}^{1}_{QM}(\Gamma)$ can be decided
efficiently. In fact, we shall see directly that the
maximum values $\lambda(\mathcal{E}^{1}_{QM}(\Gamma))$ are computed to arbitrary
precision by semidefinite programming, thus providing
efficient upper bounds to $\Omega_q$.

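Namely, for $M \gg 1$, consider the linear function

\[
\lambda \cdot p + M \sum_{xy} \Big( -1 + \sum_{ab} p_{ab|xy} \Big) = (\lambda + M\mathbf{1}) \cdot p - M |\mathcal{X} \times \mathcal{Y}|,
\tag{22}
\]

which encodes $\lambda$ plus a large negative penalty for any
$xy$ such that $\sum_{ab} p_{ab|xy} < 1$, and maximize it over the
full set of quantum models, $\mathcal{E}_{QM}(\Gamma)$. [Note that "$\le 1$"
is guaranteed by the Gleason property, which is valid in
this set.] Clearly, all the values $(\lambda + M\mathbf{1})(\mathcal{E}_{QM}(\Gamma))$ are
instances of the semidefinite programs discussed earlier,
and as $M \to \infty$,

\[ (\lambda + M\mathbf{1})(\mathcal{E}_{QM}(\Gamma)) - M |\mathcal{X} \times \mathcal{Y}| \to \lambda(\mathcal{E}^{1}_{QM}(\Gamma)). \tag{23} \]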
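
The rewriting in Eq. (22) is a simple identity, which the following sketch checks with exact arithmetic for binary settings and outcomes. The weight vector `lam` (indicator of $a \oplus b = xy$) and the deliberately subnormalised test vector `p` are arbitrary choices for illustration; `p` is not claimed to belong to any of the sets discussed.

```python
from fractions import Fraction
from itertools import product

events = list(product((0, 1), repeat=4))    # (a, b, x, y)
settings = list(product((0, 1), repeat=2))  # (x, y)

# Illustrative non-negative weights and a subnormalised probability vector.
lam = {e: int((e[0] ^ e[1]) == (e[2] & e[3])) for e in events}
p = {e: Fraction(1, 5) for e in events}     # each setting pair sums to 4/5 < 1

def penalised(M):
    """Left-hand side of Eq. (22): lambda.p plus M times the normalisation deficit."""
    value = sum(lam[e] * p[e] for e in events)
    deficit = sum(
        -1 + sum(p[(a, b, x, y)] for a, b in product((0, 1), repeat=2))
        for x, y in settings
    )
    return value + M * deficit

def shifted(M):
    """Right-hand side of Eq. (22): (lambda + M 1).p - M |X x Y|."""
    return sum((lam[e] + M) * p[e] for e in events) - M * len(settings)

for M in (1, 10, 1000):
    print(M, penalised(M), shifted(M))
```

Both sides agree for every $M$, and the value decreases linearly in $M$ for this subnormalised vector, which is the penalty mechanism that drives the maximiser towards normalised assignments.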

Implementing this for example for the CHSH inequality
[5], we recover the Tsirelson bound $2\sqrt{2}$ [32] - see
the appendix for details. On the other hand, for the
$I_{3322}$ inequality [33] the method yields the upper bound
0.25147 on the quantum value; the currently best upper
bound is slightly smaller: 0.25087556 [34], from which
we conclude that in general, $\mathcal{E}^{\mathbb{1}}_{QM}(\Gamma)$ is strictly contained
in $\mathcal{E}^{1}_{QM}(\Gamma)$ - once more, see the appendix for details. [As
an aside, we note that in the latter case, maximizing over
$\mathcal{E}_{QM}(\Gamma)$ gives the much larger bound 0.4114 - so,
unlike for classical models, in the quantum case the probability
normalization cannot be taken for granted.]

Conclusions. — Notice that the previous exposition 
bears striking similarity to the discussion of the no- 
signalling property in the context of classical, quantum, 
or more general correlations. Indeed, as it was observed 
by Popescu and Rohrlich [35], and Tsirelson [36], the no-signalling
principle is not enough to explain the scope
of quantum correlations; for instance, for the CHSH inequality,
the classical bound is 2, the quantum bound is
$2\sqrt{2}$, while the algebraic bound 4 is attainable under the
most general no-signalling correlations. Likewise here:
operational models obeying the Gleason constraint in- 
clude classical and quantum ones, but they definitely go 
beyond these two. One might ask: why is nature not 
even more contextual than quantum mechanics? 

Unlike Bell inequalities, here we see that the maximum
quantum violation is always efficiently computable,
as it is the solution to a semidefinite program, and these
are solvable in polynomial time. Thanks to the general
machinery of convex optimisation problems, this
also means that membership of a probability assignment
$p$ in $\mathcal{E}_{QM}(\Gamma)$ can be tested efficiently, despite the fact
that the set is not itself defined directly by semidefinite
constraints. Generalized models are captured instead entirely
by linear inequalities and linear programming - in
particular, also here all maximum violations of noncontextual
inequalities can be computed efficiently, as linear
programs. At the other end of the spectrum, the noncontextual
set $\mathcal{E}_C(\Gamma)$ is the convex hull of many, but easy
to describe points, but its characterisation in terms of
inequalities is computationally hard, and so are maximum
values such as $\beta_C(\Gamma)$, which can be NP-hard to compute.

The sets of probability assignments compatible with
noncontextual, quantum and generalized operational
models are different from each other even in the simplest
nontrivial case, that of the pentagon, as witnessed by the
values 2, $\sqrt{5}$, and 5/2 for $\beta(5)$, respectively. Especially
the gap between $\sqrt{5}$ for quantum and 5/2 for generalized
models is noteworthy, because the latter value is attained
by putting weight 1/2 on each vertex in a Gleason assignment
of probabilities to each of the five vertices of $C_5$. It
had been noted by other authors before that the Gleason
constraint on finite sets of vectors allows assignments
incompatible with quantum theory [14]. We believe that
here we clarified this observation further, since we showed
that each such assignment originates in fact from a sound
operational model based on generalized probabilistic theories.
Each vertex is assigned an event such that, with
respect to the given state, any adjacent pair is "complete"
in the sense that the probabilities add up to 1. It
is easy to see that quantum mechanics cannot yield this,
as it would require successive subspace projectors to be
orthogonal complements of each other.

We close by highlighting some open questions: Looking
back, it is the insistence on exclusiveness of events,
and the dropping of completeness relations, that made
the KCBS inequalities and our generalizations possible;
not insisting on effects having to sum to unity (always
prominent in the "usual" KS proofs) also seems responsible
for the fact that we obtain a semidefinite program
for the maximum quantum value. On the other hand,
how to incorporate this as an additional constraint in
the SDP?

As this seems to mark exactly the difference between
nonlocal quantum values and quantum violations of generalized
KCBS inequalities, the question arises: how good
is the latter as a bound on the former? And how does
it relate to upper bounds obtained from the Navascués-Pironio-Acín
hierarchy [37]?


Acknowledgments. — We thank P. Badziąg, J. Barrett, I.
Bengtsson, T. Cubitt, A. Harrow, A. Klyachko, D. Leung,
J.-Å. Larsson, W. Matthews, J. Oppenheim, and K.
Svozil for conversations.

The present research was supported by the European
Commission, the U.K. EPSRC, the Royal Society, the
British Academy, the Royal Academy of Engineering, the
Spanish MCI Project No. FIS2008-05596, and by the National
Research Foundation as well as the Ministry of
Education of Singapore.



* adan@us.es
† simoseve@gmail.com
‡ a.j.winter@bris.ac.uk

[1] A. A. Klyachko, M. A. Can, S. Binicioğlu, and A. S. Shumovsky, Phys. Rev. Lett. 101, 020403 (2008).
[2] A. M. Gleason, J. Math. Mech. 6(6), 885 (1957).
[3] J. S. Bell, Rev. Mod. Phys. 38, 447 (1966).
[4] S. Kochen and E. P. Specker, J. Math. Mech. 17, 59 (1967).
[5] J. F. Clauser, M. A. Horne, A. Shimony, and R. A. Holt, Phys. Rev. Lett. 23, 880 (1969).
[6] R. Lapkiewicz et al. (unpublished).
[7] A. Cabello, Phys. Rev. Lett. 101, 210401 (2008).
[8] P. Badziąg, I. Bengtsson, A. Cabello, and I. Pitowsky, Phys. Rev. Lett. 103, 050401 (2009).
[9] G. Kirchmair et al., Nature (London) 460, 494 (2009).
[10] E. Amselem, M. Rådmark, M. Bourennane, and A. Cabello, Phys. Rev. Lett. 103, 160405 (2009).
[11] A. Cabello, Phys. Rev. Lett. 104, 220401 (2010).
[12] O. Gühne et al., Phys. Rev. A 81, 022121 (2010).
[13] A. Cabello, Phys. Rev. A 82, 032110 (2010).
[14] R. Wright, in Mathematical Foundations of Quantum Mechanics, edited by A. R. Marlow (Academic Press, San Diego, 1978), p. 255.
[15] G. W. Mackey, Mathematical Foundations of Quantum Mechanics (W. A. Benjamin, New York, 1963).
[16] G. Ludwig, Z. Phys. 181(3), 233 (1964).
[17] G. Ludwig, Comm. Math. Phys. 4(5), 331 (1967).
[18] A. S. Holevo, Statistical Structure of Quantum Theory (Springer, Berlin, 2001).
[19] J. Barrett, Phys. Rev. A 75, 032304 (2007).
[20] D. Knuth, Elec. J. Comb. 1, 1 (1994).
[21] L. Lovász, IEEE Trans. Inf. Theory 25, 1 (1979).
[22] J. Körner and A. Orlitsky, IEEE Trans. Inf. Theory 44, 2207 (1998).
[23] L. Lovász, Geometric Representations of Graphs, http://www.cs.elte.hu/~lovasz/geomrep.pdf
[24] Y.-C. Liang, R. W. Spekkens, and H. M. Wiseman, arXiv:1010.1273 [quant-ph].
[25] A. Peres, Quantum Theory: Concepts and Methods (Kluwer, Dordrecht, 1993).
[26] A. Cabello, J. M. Estebaranz, and G. García-Alcaine, Phys. Lett. A 212, 183 (1996).
[27] R. Peeters, Combinatorica 16, 417 (1996).
[28] T. S. Cubitt, D. W. Leung, W. Matthews, and A. Winter, arXiv:1003.3195 [quant-ph].
[29] M. Grötschel, L. Lovász, and A. Schrijver, Geometric Algorithms and Combinatorial Optimization (Springer, Berlin, 1988).
[30] D. Bertsimas and S. Vempala, J. ACM 51, 540 (2004).
[31] Y.-K. Liu, arXiv:0712.3041 [quant-ph].
[32] B. S. Cirel'son [Tsirelson], Lett. Math. Phys. 4, 93 (1980).
[33] N. Brunner and N. Gisin, Phys. Lett. A 372, 3162 (2008).
[34] K. F. Pál and T. Vértesi, arXiv:1006.3032 [quant-ph].
[35] S. Popescu and D. Rohrlich, Found. Phys. 24, 379 (1994).
[36] B. S. Tsirelson, Hadronic J. Suppl. 8, 329 (1993).
[37] M. Navascués, S. Pironio, and A. Acín, New J. Phys. 10, 073013 (2008).

Non-locality: proofs



Here we prove the claims in the Bell inequality section. 

(i) Proof that $\mathcal{E}^1_C(\Gamma)$ = local realistic correlations. If the $A$'s and $B$'s form a (deterministic) classical local hidden variable model, then the products $P_{abxy} = A^a_x B^b_y$ are a classical noncontextual model for the graph $G$. Since for each $x$ and $y$ there is exactly one $a$ and one $b$, respectively, such that $A^a_x = B^b_y = 1$, the normalization condition is fulfilled, too.

Vice versa, given any deterministic noncontextual model for $G$ we show how to construct local hidden variables $A^a_x$ and $B^b_y$ (taking values 0 and 1) such that $P_{abxy} \le A^a_x B^b_y$; using the probability normalization, this must be an equality. Namely, assume $P_{abxy} = 1$ for some quadruple $abxy$. Then, thanks to the graph $G$, for any $a' \neq a$ and any $y'$ and $b'$, $P_{a'b'xy'} = 0$. In other words, for every $x$, there is at most one $a$ such that $P_{ab'xy'} = 1$ for some $b'y'$. Choose this $a$ (or else an arbitrary one) to let $A^a_x = 1$ and all other $A^{a'}_x = 0$. Likewise for $B^b_y$, and we clearly obtain the claim. □
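The construction in (i) can be checked by brute force in the smallest case. The sketch below assumes the CHSH scenario (binary inputs and outputs) and assumes that two events abxy are adjacent in $G$ exactly when they share $x$ but differ in $a$, or share $y$ but differ in $b$. The names `exclusive`, `normalized_models` and `local_variables` are illustrative and do not come from the paper.

```python
from itertools import product

BITS = (0, 1)

def exclusive(u, v):
    """Assumed adjacency in G: same x with different a, or same y with different b."""
    (a, b, x, y), (a2, b2, x2, y2) = u, v
    return (x == x2 and a != a2) or (y == y2 and b != b2)

def normalized_models():
    """Yield deterministic models as dicts (x, y) -> (a, b), one event per context,
    keeping only those whose chosen events are pairwise non-adjacent in G."""
    contexts = list(product(BITS, repeat=2))
    for outcomes in product(product(BITS, repeat=2), repeat=len(contexts)):
        model = dict(zip(contexts, outcomes))
        events = [(a, b, x, y) for (x, y), (a, b) in model.items()]
        if not any(exclusive(u, v) for u in events for v in events):
            yield model

def local_variables(model):
    """Read off Alice's output per x and Bob's output per y, as in the proof of (i)."""
    A = {x: a for (x, y), (a, b) in model.items()}
    B = {y: b for (x, y), (a, b) in model.items()}
    return A, B

models = list(normalized_models())
for m in models:
    A, B = local_variables(m)
    # Every normalized noncontextual model factorizes as a local deterministic strategy.
    assert all(m[(x, y)] == (A[x], B[y]) for (x, y) in m)
print(len(models))
```

Under these assumptions every surviving model is a product of a function of $x$ and a function of $y$, which is the content of the equality $P_{abxy} = A^a_x B^b_y$ in this small case.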

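(ii) Proof that $\mathcal{E}^1_{GPT}(\Gamma)$ = no-signalling correlations. Let $p \in \mathcal{E}_{GPT}(\Gamma)$ such that for all $xy$, $\sum_{ab} p_{ab|xy} = 1$. We have to show the no-signalling relations,

\[
\forall a, x\ \forall y, y':\quad \sum_b p_{ab|xy} = \sum_b p_{ab|xy'},
\]
\[
\forall b, y\ \forall x, x':\quad \sum_a p_{ab|xy} = \sum_a p_{ab|x'y}.
\]

To prove this, note for fixed $x$, $y$ and $y'$, that the vertices

\[
\{abxy : b \in \mathcal{B}\} \cup \{a'b'xy' : a' \in \mathcal{A} \setminus a,\ b' \in \mathcal{B}\}
\]

form a clique in $G$, hence

\[
\sum_b p_{ab|xy} + \sum_{a' \neq a,\, b'} p_{a'b'|xy'} \le 1.
\]

By normalization for the pair $xy'$, the second sum equals $1 - \sum_{b} p_{ab|xy'}$, which implies $\sum_b p_{ab|xy} \le \sum_b p_{ab|xy'}$ for arbitrary $y$ and $y'$. By symmetry, equality must hold. □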
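The role of the clique inequality in (ii) can be illustrated numerically. The sketch below assumes binary inputs and outputs and compares two normalized assignments: the PR box, and a hypothetical signalling box (introduced here only as a counterexample) in which Alice outputs Bob's input. All function names are illustrative.

```python
from fractions import Fraction
from itertools import product

BITS = (0, 1)

def pr_box(a, b, x, y):
    """PR box: uniform over the outcomes with a XOR b = x AND y."""
    return Fraction(1, 2) if (a ^ b) == (x & y) else Fraction(0)

def signalling_box(a, b, x, y):
    """Hypothetical counterexample: Alice outputs Bob's input, Bob outputs 0."""
    return Fraction(1) if (a == y and b == 0) else Fraction(0)

def alice_clique_sum(p, a, x, y, y2):
    """Sum of p over the clique {abxy : b} together with {a'b'xy' : a' != a, b'}."""
    own = sum(p(a, b, x, y) for b in BITS)
    rest = sum(p(a2, b2, x, y2) for a2 in BITS if a2 != a for b2 in BITS)
    return own + rest

def bob_clique_sum(p, b, x, y, x2):
    """The analogous clique sum with the roles of Alice and Bob exchanged."""
    own = sum(p(a, b, x, y) for a in BITS)
    rest = sum(p(a2, b2, x2, y) for b2 in BITS if b2 != b for a2 in BITS)
    return own + rest

def max_clique_sum(p):
    """Largest value of the clique sums used in the proof; membership needs <= 1."""
    alice = max(alice_clique_sum(p, a, x, y, y2) for a, x, y, y2 in product(BITS, repeat=4))
    bob = max(bob_clique_sum(p, b, x, y, x2) for b, x, y, x2 in product(BITS, repeat=4))
    return max(alice, bob)

def is_no_signalling(p):
    """Check that each party's marginal is independent of the other party's input."""
    alice_ok = all(
        sum(p(a, b, x, 0) for b in BITS) == sum(p(a, b, x, 1) for b in BITS)
        for a, x in product(BITS, repeat=2))
    bob_ok = all(
        sum(p(a, b, 0, y) for a in BITS) == sum(p(a, b, 1, y) for a in BITS)
        for b, y in product(BITS, repeat=2))
    return alice_ok and bob_ok

for name, box in (("PR box", pr_box), ("signalling box", signalling_box)):
    print(name, max_clique_sum(box), is_no_signalling(box))
```

Only the cliques appearing in the proof are tested here, so this is a check of the argument's key step rather than a full membership test for the generalized-model set. Exact rational arithmetic via `Fraction` avoids rounding in the comparisons.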

(iii) Proof that $\lambda(\mathcal{E}^1_C(\Gamma)) = \lambda(\mathcal{E}_C(\Gamma))$. Recall from (i) that we can find, for any deterministic noncontextual model $P_{abxy}$, local hidden variables $A^a_x$ and $B^b_y$ (taking values 0 and 1) such that $P_{abxy} \le A^a_x B^b_y$. The right hand side is evidently in $\mathcal{E}^1_C(\Gamma)$. Hence, for the purpose of maximizing an objective function with non-negative coefficients $\lambda$, we may restrict to $\mathcal{E}^1_C(\Gamma)$. □

(iv) Proof that $\mathcal{E}^1_{QM}(\Gamma)$ = quantum correlations. We face a problem like in (i): given operators $P_{abxy}$ forming a quantum model of $G$, we have to define projector valued measurements $(A^a_x)_{a \in \mathcal{A}}$ and $(B^b_y)_{b \in \mathcal{B}}$ such that $[A^a_x, B^b_y] = 0$ and $P_{abxy} = A^a_x B^b_y$.

There are obvious candidates for these "local" measurements given as marginals of $P_{abxy}$:

\[
A^a_x = \sum_{b'} P_{ab'xy} \quad \text{(for any } y\text{)},
\]
\[
B^b_y = \sum_{a'} P_{a'bxy} \quad \text{(for any } x\text{)},
\]

which raises the immediate issue that, a priori, the right hand sides may not be independent of $y$ and $x$, respectively. Denote the right hand sides above by $A^a_{xy}$ and $B^b_{xy}$. We show that the assumption of completeness, $\sum_{ab} P_{abxy} = \mathbb{1}$, implies that $A^a_{xy}$ is independent of $y$, and $B^b_{xy}$ independent of $x$. Indeed, observe that for any $a' \neq a$ and any $y$, $y'$, $b$, and $b'$, $P_{abxy} \perp P_{a'b'xy'}$, which by summation implies that

\[
A^a_{xy} \perp \sum_{a' \neq a} A^{a'}_{xy'} = \mathbb{1} - A^a_{xy'},
\]

i.e., $A^a_{xy} \le A^a_{xy'}$ for all $y$ and $y'$. By symmetry we hence must have $A^a_{xy} = A^a_{xy'}$, and likewise $B^b_{xy} = B^b_{x'y}$.

Now, observe finally

\[
A^a_x B^b_y = \sum_{a'b'} P_{ab'xy} P_{a'bxy} = P_{abxy} = B^b_y A^a_x,
\]

and we are done. □



(v) Example CHSH. Here, $\mathcal{A} = \mathcal{B} = \mathcal{X} = \mathcal{Y} = \{0, 1\}$ and $\lambda$ encodes the winning condition for the CHSH (or PR) game:

\[
\lambda_{abxy} = \begin{cases} 1 & : a \oplus b = xy, \\ 0 & : \text{otherwise.} \end{cases} \tag{24}
\]

The CHSH inequality expresses the fact that $\Omega_c = 3$ while $\Omega_q = 2 + \sqrt{2}$.

Constructing the graph and the matrices $\Lambda$ and $T$ by hand is easy: $G$ has 16 vertices, so the matrices are also $16 \times 16$. Since $\lambda$ is rather sparse, this allows us immediately to reduce it to a graph $G'$ on 8 vertices with new $\Lambda$-matrix equal to $J$, the all-1-matrix. The graph is the (1, 4)-circulant graph on 8 vertices; one can obtain it by joining antipodal vertices in the 8-cycle $C_8$. So, we find that $\lambda(\mathcal{E}_{QM}(G)) = \vartheta(G')$, and the latter is easily evaluated to $2 + \sqrt{2}$, using the dual characterisation of Lovász (i.e. our dual SDP).
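The reduction to the 8-vertex graph can be reproduced with a short script. The sketch below assumes that two events are adjacent when they share $x$ but differ in $a$, or share $y$ but differ in $b$. It then checks that the eight winning events induce a graph isomorphic to the (1, 4)-circulant, computes its independence number by brute force, and evaluates one candidate dual matrix. The weights 2 and $2 - \sqrt{2}$ are a choice made for this illustration, and the bound relies on Lovász's characterisation of $\vartheta$ as the least largest eigenvalue over symmetric matrices with ones on the diagonal and on non-edges [21]; this certifies only the upper bound $\vartheta(G') \le 2 + \sqrt{2}$.

```python
import math
from itertools import combinations, permutations, product

def exclusive(u, v):
    """Assumed adjacency: same x with different a, or same y with different b."""
    (a, b, x, y), (a2, b2, x2, y2) = u, v
    return (x == x2 and a != a2) or (y == y2 and b != b2)

# Winning events of the CHSH game: a XOR b = x AND y.
wins = [(a, b, x, y) for a, b, x, y in product((0, 1), repeat=4) if (a ^ b) == (x & y)]
n = len(wins)
edges = {frozenset(e) for e in combinations(range(n), 2)
         if exclusive(wins[e[0]], wins[e[1]])}

# The (1,4)-circulant on 8 vertices: the cycle C_8 plus antipodal chords.
circulant = {frozenset((i, (i + s) % n)) for i in range(n) for s in (1, 4)}

def isomorphic(edges_g, edges_h, size):
    """Brute-force isomorphism test; feasible here because size = 8."""
    if len(edges_g) != len(edges_h):
        return False
    pairs = [tuple(e) for e in edges_g]
    return any(all(frozenset((p[i], p[j])) in edges_h for i, j in pairs)
               for p in permutations(range(size)))

def independence_number(edge_set, size):
    """Largest set of pairwise non-adjacent vertices, by exhaustive search."""
    for k in range(size, 0, -1):
        for subset in combinations(range(size), k):
            if not any(frozenset(pair) in edge_set for pair in combinations(subset, 2)):
                return k
    return 0

def circulant_eigenvalues(first_row):
    """Eigenvalues of a symmetric circulant matrix, computed from its first row."""
    m = len(first_row)
    return [sum(c * math.cos(2 * math.pi * j * k / m) for j, c in enumerate(first_row))
            for k in range(m)]

# Candidate dual matrix B = J - 2*(cycle edges) - (2 - sqrt 2)*(antipodal edges).
# B equals 1 on the diagonal and on non-edges, so its largest eigenvalue is an
# upper bound on the Lovász number of the circulant.
first_row = [1.0] * n
first_row[1] = first_row[n - 1] = 1.0 - 2.0
first_row[4] = 1.0 - (2.0 - math.sqrt(2.0))
upper_bound = max(circulant_eigenvalues(first_row))

print(n, len(edges), isomorphic(edges, circulant, n), independence_number(edges, n))
print(upper_bound, 2 + math.sqrt(2))
```

The independence number found this way matches the classical value $\Omega_c = 3$, and the largest eigenvalue of the candidate matrix matches $2 + \sqrt{2}$ up to floating-point error.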

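(vi) Example $I_{3322}$. This is a Bell inequality for 3 settings for each of Alice and Bob, each measurement having binary output. In the form found in [33] it reads

[Equation (25): a linear combination of single-party terms $\langle A^a_x \rangle$, $\langle B^b_y \rangle$ and joint terms $\langle A^a_x B^b_y \rangle$, bounded above by 0; the individual terms are not recoverable from the source text.]

and the value 0 is the maximum attainable under local hidden variables. One form of the objective function with non-negative coefficients, using the above substitution trick, is $\lambda \cdot p$, with the vector $\lambda \in \mathbb{R}^{36}$ being given by the following table: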



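[Table: the $6 \times 6$ array of coefficients $\lambda_{abxy}$, with rows indexed by $xa$ and columns by $yb$, each ranging over 00, 01, 10, 11, 20, 21. Twenty entries equal 1 and the remaining sixteen equal 0; the positions of the individual entries are not recoverable from the source text.]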







The classical bound is $\Omega_c = 6$, while the best known quantum violation attains a value $6.250\,875\,384 \le \Omega_q$; on the other hand, it is known that $\Omega_q \le 6.250\,875\,56$, by going as far up in the Navascués-Pironio-Acín hierarchy [37] as was computationally feasible (almost the fourth level); the conjecture is that this is essentially the optimal value, although there is still disagreement from the 7th digit on. It is also conjectured that to attain the quantum limit, an infinitely large entangled state is required: in [34] a candidate sequence of larger and larger states and measurements is presented which give better and better values suggested to converge to the optimum.

The game context graph $G$ on 36 vertices is not constructed explicitly here, though it is easy. Looking at the primal SDP, and noticing that only 20 out of 36 components of $\lambda$ are populated, and then only by 1's, one sees (cf. the CHSH case) that, by constructing the induced subgraph $G'$ of the context graph on the 20 vertices $abxy$ with $\lambda_{abxy} = 1$, we obtain $\lambda(\mathcal{E}_{QM}(G')) = \vartheta(G') \approx 6.4114$.

This is an instance of the probabilities simply not adding up to 1, in other words: $\lambda(\mathcal{E}^1_{QM}(G))$ is strictly smaller. Indeed, a calculation with SeDuMi resulted in $\lambda(\mathcal{E}^1_{QM}(G)) \approx 6.25147$.



